内窥镜检查是空心器官内最广泛使用的癌症和息肉检测的医疗技术。但是,由于启蒙源方向,内窥镜获得的图像经常受到照明人工制品的影响。当内窥镜的光源姿势突然变化时,存在两个主要问题:产生过度曝光和不受欢迎的组织区域。这两种情况可能导致因影响区域缺乏信息而导致误诊,或者在非侵入性检查过程中使用了各种计算机视觉方法的性能(例如,大满贯,运动结构,光流,光流)。这项工作的目的是两倍:i)引入一种由生成对抗技术生成的新合成生成的数据集和ii),并探索在过度暴露和未渗透的照明中探索基于浅层和深度学习的基于浅的基于学习的图像增强方法条件。除了在7.6 fps左右的运行时间外,还通过基于深网的LMSPEC方法获得了最佳定量结果(即基于公制的结果)
translated by 谷歌翻译
在此贡献中,我们使用一种合奏深度学习方法来组合两个单个单阶段探测器(即Yolov4和Yolact)的预测,目的是检测内窥镜图像中的伪像。这种整体策略使我们能够改善各个模型的鲁棒性,而无需损害其实时计算功能。我们通过训练和测试两个单独的模型和各种集合配置在“内窥镜伪影检测挑战”数据集中证明了方法的有效性。广泛的实验表明,在平均平均精度方面,合奏方法比单个模型和以前的作品的优越性。
translated by 谷歌翻译
在结肠息肉是众所周知的如通过结肠镜检查鉴定的癌症的前体或者有关诊断工作为症状,结肠直肠癌筛查或某些疾病的系统的监视。虽然大部分息肉是良性的,在数量,尺寸和息肉的表面结构是紧密相连的结肠癌的风险。有高的漏检率和不完全去除结肠息肉的存在由于可变性质,困难描绘异常,高复发率和结肠的解剖外形。过去,多种方法已建成自动化息肉检测与分割。然而,大多数方法的关键问题是,他们没有经过严格的大型多中心的专用数据集进行测试。因此,这些方法可能无法推广到不同人群的数据集,因为他们过度拟合到一个特定的人口和内镜监控。在这个意义上,我们已经从整合超过300名患者6个不同的中心策划的数据集。所述数据集包括与由六名高级肠胃验证息肉边界的精确划定3446个注释息肉标签单帧和序列数据。据我们所知,这是由一组计算科学家和专家肠胃的策划最全面的检测和像素级的细分数据集。此数据集已在起源的Endocv2021挑战旨在息肉检测与分割处理可推广的一部分。在本文中,我们提供全面的洞察数据结构和注释策略,标注的质量保证和技术验证我们的扩展EndoCV2021数据集,我们称之为PolypGen。
translated by 谷歌翻译
In intensively managed forests in Europe, where forests are divided into stands of small size and may show heterogeneity within stands, a high spatial resolution (10 - 20 meters) is arguably needed to capture the differences in canopy height. In this work, we developed a deep learning model based on multi-stream remote sensing measurements to create a high-resolution canopy height map over the "Landes de Gascogne" forest in France, a large maritime pine plantation of 13,000 km$^2$ with flat terrain and intensive management. This area is characterized by even-aged and mono-specific stands, of a typical length of a few hundred meters, harvested every 35 to 50 years. Our deep learning U-Net model uses multi-band images from Sentinel-1 and Sentinel-2 with composite time averages as input to predict tree height derived from GEDI waveforms. The evaluation is performed with external validation data from forest inventory plots and a stereo 3D reconstruction model based on Skysat imagery available at specific locations. We trained seven different U-net models based on a combination of Sentinel-1 and Sentinel-2 bands to evaluate the importance of each instrument in the dominant height retrieval. The model outputs allow us to generate a 10 m resolution canopy height map of the whole "Landes de Gascogne" forest area for 2020 with a mean absolute error of 2.02 m on the Test dataset. The best predictions were obtained using all available satellite layers from Sentinel-1 and Sentinel-2 but using only one satellite source also provided good predictions. For all validation datasets in coniferous forests, our model showed better metrics than previous canopy height models available in the same region.
translated by 谷歌翻译
Event-based simulations of Spiking Neural Networks (SNNs) are fast and accurate. However, they are rarely used in the context of event-based gradient descent because their implementations on GPUs are difficult. Discretization with the forward Euler method is instead often used with gradient descent techniques but has the disadvantage of being computationally expensive. Moreover, the lack of precision of discretized simulations can create mismatches between the simulated models and analog neuromorphic hardware. In this work, we propose a new exact error-backpropagation through spikes method for SNNs, extending Fast \& Deep to multiple spikes per neuron. We show that our method can be efficiently implemented on GPUs in a fully event-based manner, making it fast to compute and precise enough for analog neuromorphic hardware. Compared to the original Fast \& Deep and the current state-of-the-art event-based gradient-descent algorithms, we demonstrate increased performance on several benchmark datasets with both feedforward and convolutional SNNs. In particular, we show that multi-spike SNNs can have advantages over single-spike networks in terms of convergence, sparsity, classification latency and sensitivity to the dead neuron problem.
translated by 谷歌翻译
The error Backpropagation algorithm (BP) is a key method for training deep neural networks. While performant, it is also resource-demanding in terms of computation, memory usage and energy. This makes it unsuitable for online learning on edge devices that require a high processing rate and low energy consumption. More importantly, BP does not take advantage of the parallelism and local characteristics offered by dedicated neural processors. There is therefore a demand for alternative algorithms to BP that could improve the latency, memory requirements, and energy footprint of neural networks on hardware. In this work, we propose a novel method based on Direct Feedback Alignment (DFA) which uses Forward-Mode Automatic Differentiation to estimate backpropagation paths and learn feedback connections in an online manner. We experimentally show that Directional DFA achieves performances that are closer to BP than other feedback methods on several benchmark datasets and architectures while benefiting from the locality and parallelization characteristics of DFA. Moreover, we show that, unlike other feedback learning algorithms, our method provides stable learning for convolution layers.
translated by 谷歌翻译
The transition angles are defined to describe the vowel-to-vowel transitions in the acoustic space of the Spectral Subband Centroids, and the findings show that they are similar among speakers and speaking rates. In this paper, we propose to investigate the usage of polar coordinates in favor of angles to describe a speech signal by characterizing its acoustic trajectory and using them in Automatic Speech Recognition. According to the experimental results evaluated on the BRAF100 dataset, the polar coordinates achieved significantly higher accuracy than the angles in the mixed and cross-gender speech recognitions, demonstrating that these representations are superior at defining the acoustic trajectory of the speech signal. Furthermore, the accuracy was significantly improved when they were utilized with their first and second-order derivatives ($\Delta$, $\Delta$$\Delta$), especially in cross-female recognition. However, the results showed they were not much more gender-independent than the conventional Mel-frequency Cepstral Coefficients (MFCCs).
translated by 谷歌翻译
This technical report presents GPS++, the first-place solution to the Open Graph Benchmark Large-Scale Challenge (OGB-LSC 2022) for the PCQM4Mv2 molecular property prediction task. Our approach implements several key principles from the prior literature. At its core our GPS++ method is a hybrid MPNN/Transformer model that incorporates 3D atom positions and an auxiliary denoising task. The effectiveness of GPS++ is demonstrated by achieving 0.0719 mean absolute error on the independent test-challenge PCQM4Mv2 split. Thanks to Graphcore IPU acceleration, GPS++ scales to deep architectures (16 layers), training at 3 minutes per epoch, and large ensemble (112 models), completing the final predictions in 1 hour 32 minutes, well under the 4 hour inference budget allocated. Our implementation is publicly available at: https://github.com/graphcore/ogb-lsc-pcqm4mv2.
translated by 谷歌翻译
Since the mid-10s, the era of Deep Learning (DL) has continued to this day, bringing forth new superlatives and innovations each year. Nevertheless, the speed with which these innovations translate into real applications lags behind this fast pace. Safety-critical applications, in particular, underlie strict regulatory and ethical requirements which need to be taken care of and are still active areas of debate. eXplainable AI (XAI) and privacy-preserving machine learning (PPML) are both crucial research fields, aiming at mitigating some of the drawbacks of prevailing data-hungry black-box models in DL. Despite brisk research activity in the respective fields, no attention has yet been paid to their interaction. This work is the first to investigate the impact of private learning techniques on generated explanations for DL-based models. In an extensive experimental analysis covering various image and time series datasets from multiple domains, as well as varying privacy techniques, XAI methods, and model architectures, the effects of private training on generated explanations are studied. The findings suggest non-negligible changes in explanations through the introduction of privacy. Apart from reporting individual effects of PPML on XAI, the paper gives clear recommendations for the choice of techniques in real applications. By unveiling the interdependencies of these pivotal technologies, this work is a first step towards overcoming the remaining hurdles for practically applicable AI in safety-critical domains.
translated by 谷歌翻译
在科学计算的许多领域越来越流行的人工神经网络(ANN)的大量使用迅速增加了现代高性能计算系统的能源消耗。新型的神经形态范式提供了一种吸引人的替代方案,它直接在硬件中实施了ANN。但是,对于科学计算中用例使用ANN在神经形态硬件上运行ANN的实际好处知之甚少。在这里,我们提出了一种方法,用于测量使用常规硬件的ANN来计算推理任务的时间。此外,我们为这些任务设计了一个体系结构,并根据最先进的模拟内存计算(AIMC)平台估算了相同的指标,这是神经形态计算中的关键范例之一。在二维凝结物质系统中的量子多体物理学中的用例比较两种方法,并在粒子物理学中大型强子对撞机上以40 MHz的速率以40 MHz的速率进行异常检测。我们发现,与传统硬件相比,AIMC最多可以达到一个较短的计算时间,最高三个数量级的能源成本。这表明使用神经形态硬件进行更快,更可持续的科学计算的潜力。
translated by 谷歌翻译